Abstract: This paper develops an optimal strategy for opportunistic
spectrum access (OSA) by integrating the design of the spectrum
sensor at the physical layer with that of spectrum sensing and
access  policies  at  the  medium  access  control  (MAC)  layer.  The 
design  objective  is  to  maximize  the  throughput  of  secondary  users 
while  limiting  their  probability  of  colliding  with  primary  users. 
By  exploiting  the  rich  structures  of  the  problem,  we  establish 
a  separation  principle:  the  design  of  spectrum  sensor  and  access 
policy  can  be  decoupled  from  that  of  sensing  policy  without  losing 
optimality. This separation principle enables us to obtain a closed-form
optimal sensor operating characteristic and access policy,
leading  to  significant  complexity  reduction.  It  also  allows  us  to 
study  the  inherent  interaction  between  spectrum  sensor  and  access 
policy  and  the  tradeoff  between  false  alarm  and  miss  detection  in 
opportunity  identification. 

I.  Introduction 

The  exponential  growth  in  wireless  services  and  the  physical 
limit on usable radio frequencies have motivated the development
of dynamic spectrum sharing and allocation technologies.
Opportunistic spectrum access (OSA), first envisioned by Mitola
[1] and then investigated by the DARPA XG program [2],
has  received  great  attention  [3]  due  to  its  potential  in  improving 
spectrum  efficiency.  The  idea  of  OSA  is  to  allow  secondary 
users  to  identify,  search  for,  and  use  instantaneous  spectrum 
opportunities  in  a  manner  that  limits  the  level  of  interference 
perceived  by  primary  users.  Correspondingly,  there  are  three 
basic  components  of  OSA:  1)  a  spectrum  sensor  at  the  physical 
layer  that  identifies  spectrum  opportunities;  2)  a  sensing  policy 
at  the  medium  access  control  (MAC)  layer  that  specifies  which 
channels  to  sense;  and  3)  an  access  policy  at  the  MAC  layer 
that  determines  the  subset  of  channels  on  which  to  transmit 
based  on  the  sensing  outcomes. 

In  this  paper,  we  aim  at  optimal  OSA  by  integrating  the 
design  of  spectrum  sensor  at  the  physical  layer  with  that  of 
sensing  and  access  policies  at  the  MAC  layer.  The  objective 
is  to  maximize  the  throughput  of  secondary  users  under  the 
constraint  that  the  probability  of  collision  perceived  by  primary 
users  is  below  a  certain  threshold.  We  formulate  the  joint  OSA 
design  as  a  constrained  partially  observable  Markov  decision 
process  (POMDP),  which  often  requires  randomized  policies 
to  achieve  optimality.  By  exploiting  the  underlying  structure 
of the problem, we establish a separation principle: the design
of the spectrum sensor and access policy can be decoupled from
that of the sensing policy without losing optimality. This separation
principle allows us to obtain a closed-form optimal sensor
operating  point  and  access  policy,  and  reduce  a  constrained 
POMDP  to  an  unconstrained  one,  leading  to  deterministic 
optimal  sensing  policy.  It  also  enables  us  to  study  the  inherent 
interaction  between  spectrum  sensor  and  access  policy  and 
obtain  the  best  tradeoff  between  false  alarm  and  miss  detection 
in  spectrum  opportunity  identification. 

Related Work. Unlike this paper, which mainly addresses
the exploitation of temporal spectrum opportunities resulting
from the bursty traffic of primary users, the majority of existing
work (see [4]-[6] and references therein) focuses on geographic
spectrum opportunities that are static or slowly varying in time.
The  application  being  considered  is  a  network  of  geographically 
distributed  secondary  users,  each  affected  by  a  different  set  of 
primary  users  whose  spectrum  access  activities  are  considered 
static  over  a  long  period  of  time.  The  design  objective  is  to 
allocate  these  spatially  varying  spectrum  opportunities  among 
secondary  users  so  that  the  network-level  spectrum  efficiency 
is maximized subject to some regulatory constraint on interference
to primary users. Due to the slow temporal variation of
spectrum  occupancy,  opportunity  identification  is  not  as  critical 
a  component  in  this  class  of  applications,  and  the  existing  work 
often  assumes  perfect  knowledge  of  spectrum  opportunities  in 
the  whole  spectrum  at  any  location. 

Research  efforts  have  also  been  made  to  exploit  temporal 
spectrum opportunities [7]-[9] under perfect spectrum sensing.
However,  in  the  presence  of  noise  and  fading,  sensing  errors 
will  occur.  Hence,  the  design  of  sensing  and  access  policies 
should  take  into  account  the  operating  characteristics  of  the 
spectrum  sensor.  A  heuristic  approach  has  been  proposed  in 
[9]  for  OSA  in  the  presence  of  sensing  errors.  This  paper 
develops  a  mathematical  framework  for  the  optimal  joint  design 
of  spectrum  sensor  at  the  physical  layer  and  sensing  and  access 
policies  at  the  MAC  layer. 

II.  Network  Model 

Consider a spectrum that consists of $N$ channels, each
with bandwidth $B_n$ ($n = 1, \ldots, N$). These $N$ channels are
licensed to a slotted primary network. We model the spectrum
occupancy of primary users by a discrete-time Markov process
whose states are defined as $S(t) = [S_1(t), \ldots, S_N(t)]$, where
$S_n(t) \in \{0 \text{ (busy)}, 1 \text{ (idle)}\}$ denotes the availability of channel
$n$ in slot $t$. The state space $\mathcal{S}$ of this Markov process is thus
given by $\mathcal{S} = \{0, 1\}^N$. The probability that the spectrum
occupancy transits from state $s$ to state $s'$ is denoted by $P_{s,s'}$.
We  consider  a  group  of  secondary  users  seeking  spectrum 
opportunities  in  these  N  channels.  We  focus  on  an  ad  hoc 
network  where  secondary  users  sense  and  access  the  spectrum 
independently.  At  the  beginning  of  each  slot,  a  secondary  user 
with data to transmit chooses at most L channels to sense
and then decides whether to access these channels based
on  the  sensing  outcomes.  Such  spectrum  sensing  and  access 
decisions  are  made  to  maximize  its  own  throughput  while 
limiting  the  interference  to  primary  users  by  fully  exploiting 
the  sensing  history  and  the  spectrum  occupancy  dynamics. 
When  the  secondary  user  decides  to  transmit,  it  generates  a 
random  backoff  time,  and  transmits  when  this  timer  expires 
and  no  other  secondary  user  has  already  accessed  the  channel 
during  the  backoff  time.  At  the  end  of  the  slot,  the  receiver 
acknowledges  a  successful  data  transmission.  The  basic  slot 
structure  is  illustrated  in  Fig.  1.  Details  on  implementation  can 
be  found  in  [9].  In  this  paper,  we  assume  L  =  1.  Extension  to 
L  >  1  will  be  discussed  in  the  upcoming  journal  paper. 

Our  goal  is  to  jointly  optimize  the  spectrum  sensor  and 
the  sensing/access  policies  to  maximize  the  throughput  of 
the  secondary  user  in  T  slots  under  the  constraint  that  the 
probability $P_a(t)$ of collision perceived by the primary network
in any channel and any slot does not exceed a threshold $\zeta$, i.e.,
$P_a(t) = \Pr\{\text{collision} \mid S_a(t) = 0\} \le \zeta$ for any channel $a$ and
slot $t$. Note that the maximum allowed collision probability $\zeta$
is generally specified by the primary network.

III.  Problem  Formulation 
In  this  section,  we  formulate  the  optimal  joint  OSA  design 
as  a  constrained  POMDP  problem.  Involved  in  the  design  are 
three  basic  components:  a  spectrum  sensor,  a  sensing  policy, 
and  an  access  policy. 

A.  Spectrum  Sensor 

The  spectrum  sensor  of  a  secondary  user  detects,  at  the 
beginning  of  each  slot,  the  availability  of  the  chosen  channel. 
It can be considered as performing a binary hypothesis test:
$H_0$ (null hypothesis indicating that the sensed channel is idle)
vs. $H_1$ (alternative). Let $\Theta_a$ be the sensing outcome (the result
of the hypothesis test): $\Theta_a = 1$ (idle) and $\Theta_a = 0$ (busy).

Fig. 1. The slot structure (spectrum sensing, data transmission, acknowledgement).

If the sensor mistakes $H_0$ for $H_1$, a false alarm occurs, and a
spectrum opportunity is overlooked by the sensor. On the other
hand, when the sensor mistakes $H_1$ for $H_0$, we have a miss
detection. Let $\epsilon = \Pr\{\Theta_a = 0 \mid S_a = 1\}$ and
$\delta = \Pr\{\Theta_a = 1 \mid S_a = 0\}$ denote, respectively, the probabilities of false alarm
and miss detection. The performance of a sensor is specified by
the receiver operating characteristic (ROC) curve, which gives
the probability of detection $1 - \delta$ as a function of $\epsilon$ (an example
is given in Fig. 2). We point out that analyzing the ROC curve of
the spectrum sensor in a wireless network environment can be
complex. We assume here that the ROC curve of the spectrum
sensor has already been obtained, and we focus on the tradeoff
between false alarm and miss detection. Specifically, we seek
to answer the question of which point $\delta$ on the given ROC curve
the spectrum sensor should operate at.

If  the  secondary  user  completely  trusts  the  sensing  outcome 
in  decision-making,  false  alarms  result  in  wasted  spectrum 
opportunities  while  miss  detections  lead  to  collisions  with 
primary  users.  To  optimize  the  performance  of  the  secondary 
user  while  limiting  its  interference  to  the  primary  network,  we 
should  carefully  choose  the  sensor  operating  point.  Meanwhile, 
the  spectrum  access  decisions  should  be  made  by  taking  into 
account  the  sensor  operating  characteristics.  A  joint  design  of 
the  spectrum  sensor  at  the  physical  layer  and  the  access  policy 
at  the  MAC  layer  is  thus  necessary  to  achieve  optimality. 

B.  Sensing  and  Access  Policies 

The  sensing  policy  specifies,  in  each  slot,  which  channel  to 
sense,  and  the  access  policy  determines  whether  to  transmit 
based  on  the  sensing  outcome.  At  the  beginning  of  a  slot, 
a  secondary  user  with  data  to  transmit  chooses  a  channel 
$a \in \{1, \ldots, N\}$ to sense. Based on the sensing outcome
$\Theta_a$, the secondary user decides whether to transmit over the
sensed channel: $\Phi_a \in \{0 \text{ (no access)}, 1 \text{ (access)}\}$. At the
end of the slot, the receiver acknowledges a successful data
transmission: $K_a \in \{0 \text{ (unsuccessful)}, 1 \text{ (successful)}\}$. Note
that an acknowledgement $K_a = 1$ is obtained if and only if
the secondary user chooses to access ($\Phi_a = 1$) and the channel
is idle ($S_a = 1$), i.e., $K_a = 1_{[S_a = 1,\, \Phi_a = 1]}$. A reward
is accrued depending on $K_a$. Assuming that the number of
information bits that can be transmitted is proportional to the
channel bandwidth, we define the reward $R_{K_a}^{(a, \Phi_a)}$ obtained by
choosing sensing and access action $(a, \Phi_a)$ as:

$$R_{K_a}^{(a, \Phi_a)} = K_a B_a. \qquad (1)$$

Due to partial spectrum monitoring and sensing errors,
the secondary user and the receiver cannot directly observe
the current state of the spectrum occupancy. We thus have
a POMDP. It has been shown in [11] that the knowledge
of the current spectrum occupancy state based on all past
decisions (i.e., sensing and access actions) and observations
(i.e., acknowledgements) can be summarized by a belief state
$\Lambda(t) = \{\lambda_s(t)\}_{s \in \mathcal{S}}$. Each element $\lambda_s(t)$ of the belief state
$\Lambda(t)$ is the conditional probability (given the decision and
observation history) that the current spectrum occupancy state
is given by $S(t) = s$ prior to the state transition in slot $t$.
Hence, a sensing policy $\pi_s$ is given by a sequence of functions
$\pi_s = [\mu_1, \ldots, \mu_T]$, where $\mu_t : [0, 1]^{|\mathcal{S}|} \to \{1, \ldots, N\}$ maps
the belief state $\Lambda(t) \in [0, 1]^{|\mathcal{S}|}$ at the beginning of slot $t$ to
a channel $a \in \{1, \ldots, N\}$ to be sensed. An access policy
$\pi_c$ is given by a sequence of functions $\pi_c = [\nu_1, \ldots, \nu_T]$,
where $\nu_t : [0, 1]^{|\mathcal{S}|} \times \{0, 1\} \to \{0, 1\}$ maps the belief state
$\Lambda(t) \in [0, 1]^{|\mathcal{S}|}$ and the sensing outcome $\Theta_a \in \{0, 1\}$ of the
chosen channel $a$ to an access action $\Phi_a \in \{0, 1\}$.

In this paper, we assume that the state transition probabilities
$\{P_{s,s'}\}$ of the underlying Markov model are known. If the transition
probabilities are unknown, formulations and algorithms
for  POMDP  with  an  unknown  model  exist  in  the  literature  [12] 
and  can  be  applied  to  our  problem.  In  Section  V,  we  study  the 
impact  of  mismatched  Markov  model  on  the  OSA  performance. 


C.  Design  Objective 

We aim to determine the optimal sensor operating point
$\delta^*$ and the optimal sensing and access policies $\{\pi_s^*, \pi_c^*\}$. The
objective is to maximize the throughput of the secondary user
(or equivalently the total expected reward) in $T$ slots under the
collision constraint:

$$\{\delta^*, \pi_s^*, \pi_c^*\} = \arg\max_{\delta, \pi_s, \pi_c} \mathbb{E}_{\{\delta, \pi_s, \pi_c\}} \left[ \sum_{t=1}^{T} R_{K_a}^{(a, \Phi_a)}(t) \,\middle|\, \Lambda(1) \right]$$

s.t. $P_a(t) = \Pr\{\Phi_a(t) = 1 \mid S_a(t) = 0, \Lambda(t)\} \le \zeta$ holds
for any $a$ and $t$ such that $\Pr\{S_a(t) = 0 \mid \Lambda(t)\} > 0$, (2)

where $\mathbb{E}_{\{\delta, \pi_s, \pi_c\}}$ is the expectation given that sensing and
access policies $\{\pi_s, \pi_c\}$ are employed and the sensor operates at
point $\delta$, and $\Lambda(1)$ is the initial belief state, which is usually given
by the stationary distribution of the spectrum occupancy states.
Note that when $\Pr\{S_a(t) = 0 \mid \Lambda(t)\} = 0$, i.e., channel $a$
is available with probability 1 in slot $t$, the constraint in (2)
becomes irrelevant and the secondary user's access decision
is simply $\Phi_a(t) = 1$. In the rest of this paper, we consider
the non-trivial case where $\Pr\{S_a(t) = 0 \mid \Lambda(t)\} > 0$ in any
channel $a$ and slot $t$.


IV.  Separation  Principle  for  Optimal  Joint  Design 

The  design  objective  given  in  (2)  is  a  constrained  POMDP, 
which usually requires randomized policies to achieve optimality.
In this case, a sensing policy determines the mapping from
the current belief state to the probability of choosing each
channel, and an access policy determines the mapping from the current
belief state to the transmission probabilities under different
sensing outcomes. Since there exist uncountably many probability
distributions, randomized policies are computationally
prohibitive.  In  this  section,  we  establish  a  separation  principle 
for  the  optimal  joint  design.  This  separation  principle  reveals 
the  existence  of  deterministic  optimal  sensing  and  access 
policies,  leading  to  significant  complexity  reduction.  It  also 
enables  us  to  obtain  closed-form  optimal  sensor  operating  point 
at  which  the  best  tradeoff  between  false  alarms  and  miss 
detections  is  achieved. 

A.  The  Impact  of  Sensor  Operating  Point  on  Access  Policy 

Let $f_a^{\theta}(\Lambda(t), t)$ be the probability of transmitting over chosen
channel $a$ given sensing outcome $\Theta_a = \theta$ and belief state $\Lambda(t)$
at the beginning of slot $t$. In Theorem 1, we derive closed-form
optimal transmission probabilities $(f_a^1(\Lambda(t), t), f_a^0(\Lambda(t), t))$ for
different given sensor operating points $\delta$.

Theorem 1: The optimal access policy is time-invariant and
belief-independent. Specifically, the optimal transmission probabilities
are solely determined by the sensor operating point $\delta$
and the maximum allowed probability of collision $\zeta$, i.e., for
any chosen channel $a$, belief state $\Lambda(t)$, and slot $t$, we have

$$(f_a^1(\Lambda(t), t),\, f_a^0(\Lambda(t), t)) = \begin{cases} \left(1, \dfrac{\zeta - \delta}{1 - \delta}\right), & \delta < \zeta, \\ (1, 0), & \delta = \zeta, \\ \left(\dfrac{\zeta}{\delta}, 0\right), & \delta > \zeta. \end{cases} \qquad (3)$$

Proof: See [10] for details.

Theorem  1  enables  us  to  study  the  impact  of  sensor  operating 
characteristics  on  the  optimal  access  policy.  As  illustrated  in 
Fig. 2, the ROC curve can be partitioned into two regions:
the “conservative” region ($\delta > \zeta$) and the “aggressive” region
($\delta < \zeta$). When $\delta > \zeta$, the spectrum sensor is more likely to
misidentify an opportunity (i.e., a busy channel is sensed to
be idle). Hence, the access policy should be conservative to
ensure that the probability of collision does not exceed $\zeta$.
Specifically, even when the sensing outcome $\Theta_a = 1$ indicates
that the channel is available, the user should only transmit
with probability $\zeta/\delta < 1$. When the channel is sensed to be
busy ($\Theta_a = 0$), the user should trust the sensing outcome and
refrain from transmission. On the other hand, when $\delta < \zeta$, the
spectrum sensor is more likely to overlook an opportunity (i.e.,
an idle channel is sensed to be busy). Hence, the user should
adopt an aggressive access policy: always transmit when the
channel is sensed to be available and transmit with probability
$(\zeta - \delta)/(1 - \delta) > 0$ even when the channel is sensed to be busy. When
$\delta = \zeta$, the optimal access policy is deterministic: always trust
the sensing outcome.
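The access rule discussed above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names `optimal_access_probs` and `collision_probability` are hypothetical, and the transmission probabilities are obtained under the assumption that the collision constraint is met with equality, i.e., $\delta f^1 + (1 - \delta) f^0 = \zeta$, with one of the two probabilities pinned to 1 or 0 depending on the region.

```python
from typing import Tuple


def optimal_access_probs(delta: float, zeta: float) -> Tuple[float, float]:
    """Return (f1, f0): transmission probabilities when sensed idle / sensed busy.

    Assumes the collision constraint delta*f1 + (1-delta)*f0 <= zeta is tight.
    """
    if not (0.0 < delta < 1.0 and 0.0 < zeta < 1.0):
        raise ValueError("delta and zeta must lie strictly between 0 and 1")
    if delta < zeta:
        # Aggressive region: always transmit when sensed idle and spend the
        # remaining collision budget on slots that were sensed busy.
        return 1.0, (zeta - delta) / (1.0 - delta)
    if delta > zeta:
        # Conservative region: never transmit when sensed busy and thin out
        # transmissions when sensed idle.
        return zeta / delta, 0.0
    # delta == zeta: trust the sensing outcome.
    return 1.0, 0.0


def collision_probability(delta: float, f1: float, f0: float) -> float:
    """Probability of transmitting given that the sensed channel is busy."""
    return delta * f1 + (1.0 - delta) * f0


if __name__ == "__main__":
    zeta = 0.05
    for delta in (0.01, 0.05, 0.20):
        f1, f0 = optimal_access_probs(delta, zeta)
        print(delta, f1, f0, collision_probability(delta, f1, f0))
```

Under this assumption the collision probability equals $\zeta$ in all three regions, whichever channel is sensed and whatever the belief state is.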


B.  The  Separation  Principle 

Given belief state $\Lambda(t)$ at the beginning of slot $t$, we can
rewrite the collision probability constrained in (2) as

$$P_a(t) = \sum_{\theta=0}^{1} \Pr\{\Phi_a = 1 \mid \Theta_a = \theta\} \Pr\{\Theta_a = \theta \mid S_a(t) = 0\} = \delta f_a^1(\Lambda(t), t) + (1 - \delta) f_a^0(\Lambda(t), t). \qquad (4)$$

Careful inspection of (3) and (4) reveals that the constraint
given in (2) is satisfied regardless of the chosen channel. We
thus have a separation principle (Theorem 2) for the optimal
joint OSA design, which decouples the design of the spectrum
sensor and access policy from that of the sensing policy. Following
this separation principle, we obtain the closed-form optimal sensor
operating point $\delta^*$ and access policy $\pi_c^*$ in Theorem 3.
Theorem 2: Separation Principle

The joint design of OSA formulated in (2) can be obtained
in two steps without losing optimality. First, choose the sensor
operating point $\delta$ and access policy $\pi_c$ according to (3)
to maximize the expected immediate reward. Second, choose the
sensing policy $\pi_s$ to maximize the expected total reward.

Proof: See [10] for details.

Theorem 3: The optimal sensor operating point is $\delta^* = \zeta$.
The optimal access policy $\pi_c^*$ is given by $\Phi_a^* = \Theta_a$.

Proof: See [10] for details.

Theorem 3 reveals the existence of a deterministic optimal
access policy for the constrained POMDP given in (2). Specifically,
the optimal access policy $\pi_c^*$ is to simply trust the sensing
outcome: $\Phi_a^* = \Theta_a$, i.e., access if and only if the channel is
detected to be available.


C.  The  Optimal  Sensing  Policy 

In Theorem 3, we have obtained the optimal sensor operating
point $\delta^*$ and the optimal access policy $\pi_c^*$. Since $\delta^*$ and $\pi_c^*$ have
been chosen to ensure the constraint regardless of the chosen
channel, we are free to search for the optimal sensing policy $\pi_s^*$
over the whole design space. The design of the sensing policy
thus becomes an unconstrained POMDP, where optimality can
be achieved by deterministic policies.

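Let $V_t(\Lambda(t))$ denote the maximum total expected reward
that can be obtained from slot $t$ ($1 \le t \le T$) onward given the belief state $\Lambda(t)$ at the
beginning of slot $t$. Given sensor operating point $\delta^*$ and access
policy $\pi_c^*$, we obtain $V_t(\Lambda(t))$ recursively by

$$V_t(\Lambda(t)) = \max_{a} \sum_{s \in \mathcal{S}} \sum_{s' \in \mathcal{S}} \lambda_{s'}(t) P_{s',s} \sum_{k_a=0}^{1} Q_s(k_a) \left[ k_a B_a + V_{t+1}(\mathcal{T}(\Lambda(t) \mid a, k_a)) \right], \quad 1 \le t < T,$$

$$V_T(\Lambda(T)) = \max_{a} \sum_{s \in \mathcal{S}} \sum_{s' \in \mathcal{S}} \lambda_{s'}(T) P_{s',s} Q_s(1) B_a, \qquad (5)$$

where $Q_s(0) = 1 - Q_s(1)$ and $Q_s(1) = \Pr\{K_a = 1 \mid S(t) = s\} = 1_{[s_a = 1]} (1 - \epsilon^*)$
is the probability of successful transmission under
current spectrum occupancy state $S(t) = s = [s_1, \ldots, s_N]$.
Note that $1_{[s_a = 1]}$ indicates whether channel $a$ is idle given
$S(t) = s$, and $\epsilon^*$ is the probability of false alarm that is
achieved when the spectrum sensor operates at $\delta^*$. The updated
belief state $\Lambda(t + 1) = \mathcal{T}(\Lambda(t) \mid a, k_a)$ can be obtained via
Bayes' rule as

$$\lambda_s(t + 1) = \frac{\sum_{s' \in \mathcal{S}} \lambda_{s'}(t) P_{s',s} Q_s(k_a)}{\sum_{s'' \in \mathcal{S}} \sum_{s' \in \mathcal{S}} \lambda_{s'}(t) P_{s',s''} Q_{s''}(k_a)}. \qquad (6)$$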
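The Bayes update in (6) is easy to misread in its index order, so a minimal Python sketch may help. It is an illustration only: the names `success_prob` and `belief_update`, the dictionary-based belief representation, and the 0-based channel index are assumptions made here, not part of the original formulation.

```python
from typing import Dict, Tuple

State = Tuple[int, ...]        # one entry per channel: 0 = busy, 1 = idle
Belief = Dict[State, float]    # probability of each occupancy state


def success_prob(s: State, a: int, eps: float) -> float:
    """Q_s(1): probability of an acknowledgement when channel a is sensed in state s."""
    return (1.0 - eps) if s[a] == 1 else 0.0


def belief_update(belief: Belief, P: Dict[State, Dict[State, float]],
                  a: int, k: int, eps: float) -> Belief:
    """Bayes update of the belief after sensing channel a and observing ack k (0 or 1)."""
    unnormalized = {}
    for s in belief:
        # Probability of landing in state s after the transition in this slot.
        predicted = sum(belief[sp] * P[sp][s] for sp in belief)
        q1 = success_prob(s, a, eps)
        unnormalized[s] = predicted * (q1 if k == 1 else 1.0 - q1)
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: v / total for s, v in unnormalized.items()}


if __name__ == "__main__":
    # Single channel: busy -> idle with probability 0.2, idle -> idle with 0.8.
    P = {(0,): {(0,): 0.8, (1,): 0.2}, (1,): {(0,): 0.2, (1,): 0.8}}
    prior = {(0,): 0.5, (1,): 0.5}
    print(belief_update(prior, P, a=0, k=1, eps=0.1))
    print(belief_update(prior, P, a=0, k=0, eps=0.1))
```

An acknowledgement ($k_a = 1$) can only occur on an idle channel, so it collapses the belief onto states in which channel $a$ is idle; a missing acknowledgement shifts weight toward busy states without ruling out a false alarm.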


The optimal sensing policy $\pi_s^*$ can be obtained by solving
the optimality equation given in (5). It is shown in [11] that
$V_t(\Lambda(t))$ is piecewise linear and convex, leading to a linear
programming procedure for calculating $\pi_s^*$.
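For very small instances, the optimality equation (5) can also be evaluated by direct recursion over the observation tree. The Python sketch below does exactly that; it is not the linear programming procedure of [11], its cost grows as $(2N)^{T}$, and the names `independent_channel_model` and `value`, the 0-based channel indices, and the restriction to independently evolving two-state channels are assumptions made for illustration.

```python
from itertools import product
from math import prod
from typing import Dict, Sequence, Tuple

State = Tuple[int, ...]
Belief = Dict[State, float]
Model = Dict[State, Dict[State, float]]


def independent_channel_model(alpha: Sequence[float], beta: Sequence[float]) -> Model:
    """Joint transition probabilities P[s_prev][s_next] for independent channels.

    alpha[n]: busy -> idle probability; beta[n]: idle -> idle probability.
    """
    states = list(product((0, 1), repeat=len(alpha)))

    def step(n: int, x: int, y: int) -> float:
        p_idle = beta[n] if x == 1 else alpha[n]
        return p_idle if y == 1 else 1.0 - p_idle

    return {sp: {s: prod(step(n, sp[n], s[n]) for n in range(len(alpha)))
                 for s in states} for sp in states}


def value(belief: Belief, t: int, T: int, P: Model,
          B: Sequence[float], eps: float) -> Tuple[float, int]:
    """Return (V_t(belief), maximizing channel) by expanding the recursion."""
    states = list(belief)
    # Distribution over occupancy states after the transition in slot t.
    pred = {s: sum(belief[sp] * P[sp][s] for sp in states) for s in states}
    best_value, best_channel = float("-inf"), -1
    for a in range(len(B)):
        total = 0.0
        for k in (0, 1):
            # Q_s(k): likelihood of acknowledgement outcome k in state s.
            q = {s: ((1.0 - eps) if s[a] == 1 else 0.0) for s in states}
            if k == 0:
                q = {s: 1.0 - v for s, v in q.items()}
            p_k = sum(pred[s] * q[s] for s in states)
            if p_k <= 0.0:
                continue
            future = 0.0
            if t < T:
                posterior = {s: pred[s] * q[s] / p_k for s in states}
                future = value(posterior, t + 1, T, P, B, eps)[0]
            total += p_k * (k * B[a] + future)
        if total > best_value:
            best_value, best_channel = total, a
    return best_value, best_channel


if __name__ == "__main__":
    P = independent_channel_model([0.2, 0.6], [0.8, 0.4])
    start = {s: (1.0 if s == (0, 0) else 0.0) for s in P}
    print(value(start, 1, 3, P, [1.0, 1.0], eps=0.1))
```

The returned channel is the sensing decision for slot $t$ at the given belief; repeating the call with the updated belief in each slot yields a deterministic sensing policy for this toy setting.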


V.  Simulation  Examples 


In this section, we provide two simulation examples to study
the impacts of sensor operating point $\delta$ and mismatched Markov
model on the performance of the optimal OSA. We consider
$N = 3$ independently evolving channels with the same bandwidth
$B_n = 1$. As illustrated in Fig. 3, the state transition of
spectrum occupancy can be characterized by $\alpha = [\alpha_1, \alpha_2, \alpha_3]$
and $\beta = [\beta_1, \beta_2, \beta_3]$, where $\alpha_n$ denotes the probability that
channel $n$ transits from state 0 (busy) to state 1 (idle) and $\beta_n$
denotes the probability that it stays in state 1. We assume that
the spectrum occupancy dynamics remain unchanged during
$T = 10$ slots. The throughput of the secondary user is measured
by the expected total reward per slot, i.e., $V_1(\Lambda(1))/T$, where
$\Lambda(1)$ is given by the stationary distribution of the underlying
Markov process.
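As a rough illustration of how the initial belief could be built for independently evolving channels, the sketch below computes the per-channel stationary idle probability and multiplies the marginals. The function names are hypothetical, and the product form relies on the independence assumption stated above; it also assumes $1 - \beta_n + \alpha_n > 0$ for every channel.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

State = Tuple[int, ...]


def stationary_idle_prob(alpha: float, beta: float) -> float:
    """Stationary idle probability of a two-state channel.

    Solves pi_idle = (1 - pi_idle) * alpha + pi_idle * beta for pi_idle.
    """
    return alpha / (1.0 - beta + alpha)


def initial_belief(alphas: Sequence[float], betas: Sequence[float]) -> Dict[State, float]:
    """Joint stationary distribution over {0,1}^N for independent channels."""
    idle = [stationary_idle_prob(a, b) for a, b in zip(alphas, betas)]
    belief = {}
    for s in product((0, 1), repeat=len(idle)):
        p = 1.0
        for s_n, p_idle in zip(s, idle):
            p *= p_idle if s_n == 1 else 1.0 - p_idle
        belief[s] = p
    return belief


if __name__ == "__main__":
    belief = initial_belief([0.2, 0.4, 0.6], [0.8, 0.6, 0.4])
    for state, prob in sorted(belief.items()):
        print(state, prob)
```

For transition probabilities with $\alpha_n + \beta_n = 1$, each channel is idle half of the time in steady state, so the resulting initial belief is uniform over the $2^N$ occupancy states.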

At the beginning of each slot, the spectrum sensor takes $M$
measurements $\{Y_i\}_{i=1}^{M}$ of the chosen channel. We assume that
both the channel noise and the signal of primary users can be
modeled as white Gaussian processes. Then, the spectrum
sensor performs the following hypothesis test:

$$H_0 \text{ (idle channel)}: Y_i \sim \mathcal{N}(0, \sigma_0^2), \quad i = 1, \ldots, M,$$
$$H_1 \text{ (busy channel)}: Y_i \sim \mathcal{N}(0, \sigma_1^2), \quad i = 1, \ldots, M,$$

where $\sigma_0^2$ is the noise power and $\sigma_1^2$ is the primary signal power.
It can be readily shown that the energy detector is optimal under
the Neyman-Pearson (NP) criterion [13, Sec. 2.6.2]:

$$\sum_{i=1}^{M} Y_i^2 \underset{H_0}{\overset{H_1}{\gtrless}} \eta, \qquad (7)$$


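where the threshold $\eta$ determines the false alarm and miss
detection rates of the detector. The ROC curve of the energy
detector is given by [13, Sec. 2.6.2]

$$1 - \delta = 1 - \gamma\!\left(\frac{M}{2},\, \frac{\gamma^{-1}\!\left(\frac{M}{2},\, 1 - \epsilon\right)}{1 + \mathrm{SNR}}\right), \qquad (8)$$

where $\mathrm{SNR} = (\sigma_1^2 - \sigma_0^2)/\sigma_0^2$ is the SNR and
$\gamma(m, a) = \frac{1}{\Gamma(m)} \int_0^a t^{m-1} e^{-t}\, dt$ is the incomplete gamma function,
with $\gamma^{-1}(m, \cdot)$ denoting its inverse in the second argument. In
all the figures, we assume $M = 10$ and SNR = 5 dB.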
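The ROC of the energy detector can be traced numerically with the standard library alone when $M$ is even. The sketch below is an illustration under stated assumptions: under $H_0$ the statistic $\sum_i Y_i^2 / \sigma_0^2$ is chi-square with $M$ degrees of freedom, the ratio $\sigma_1^2/\sigma_0^2$ equals $1 + \mathrm{SNR}$, and the regularized incomplete gamma function has the closed form $\gamma(m, a) = 1 - e^{-a} \sum_{k=0}^{m-1} a^k / k!$ for integer $m$. The function names are hypothetical.

```python
import math


def reg_lower_gamma_int(m: int, a: float) -> float:
    """Regularized lower incomplete gamma function gamma(m, a) for integer m >= 1."""
    term, acc = 1.0, 1.0  # k = 0 term of the finite series
    for k in range(1, m):
        term *= a / k
        acc += term
    return 1.0 - math.exp(-a) * acc


def normalized_threshold(eps: float, m: int) -> float:
    """Solve gamma(m, x) = 1 - eps for x = eta / (2 * sigma0^2) by bisection."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma_int(m, hi) < 1.0 - eps:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma_int(m, mid) < 1.0 - eps:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def miss_detection(eps: float, M: int, snr_db: float) -> float:
    """Miss detection probability delta for a given false alarm probability eps."""
    if M < 2 or M % 2 != 0:
        raise ValueError("this sketch only handles even M >= 2")
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    snr = 10.0 ** (snr_db / 10.0)
    x = normalized_threshold(eps, M // 2)
    # Under H1 the same threshold is scaled by sigma0^2 / sigma1^2 = 1 / (1 + SNR).
    return reg_lower_gamma_int(M // 2, x / (1.0 + snr))


if __name__ == "__main__":
    for eps in (0.01, 0.05, 0.1, 0.3):
        print(eps, miss_detection(eps, M=10, snr_db=5.0))
```

Sweeping `eps` over $(0, 1)$ traces the curve from which the operating point $\delta$ is selected; a lower false alarm probability always comes at the price of a higher miss detection probability.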

Fig. 3. The Markov channel model.

Fig. 4. The impact of spectrum sensor operating point on the throughput of
the secondary user, $\alpha = [0.2, 0.4, 0.6]$, $\beta = [0.8, 0.6, 0.4]$, $\zeta = 0.05$.

Fig. 4 studies the impact of sensor operating point $\delta$ on
the throughput and the optimal access policy of the secondary
user. The upper figure plots the maximum throughput of
the secondary user for each given sensor operating point $\delta$.
The optimal access policy is specified by the transmission
probabilities $(f_a^1, f_a^0)$, which are shown in the middle and
the lower figures, respectively. We can see that the maximum
throughput is achieved at $\delta^* = \zeta = 0.05$ and the transmission
probabilities change with $\delta$ as given by Theorem 1. Interestingly,
the throughput curve is concave with respect to $\delta$ in the
“aggressive” region ($\delta < \zeta$) and convex in the “conservative”
region ($\delta > \zeta$). The performance thus degrades at a faster rate
when the sensor operating point drifts toward the “conservative”
region. This suggests that miss detections (which lead to
collisions) are more harmful to the performance of OSA than
false alarms (which represent missed opportunities).


Fig. 5. The impact of inaccurate transition probabilities on the throughput of
the secondary user, $\alpha = [0.2, 0.4, 0.6]$, $\beta = [0.8, 0.6, 0.4]$, $\zeta = 0.05$
(horizontal axis: relative error $\psi$ in transition probabilities, in %).

In Fig. 5, we study the impact of mismatched Markov model
on the performance of the optimal OSA. We assume that
the spectrum occupancy evolves according to the transition
probabilities given by $\alpha$ and $\beta$ while the secondary user
employs the optimal OSA policy based on inaccurate transition
probabilities $\alpha'$ and $\beta'$. In the upper figure, we plot the relative
throughput loss of the secondary user as a function of the
relative error $\psi$ in transition probabilities, which is given by
$\psi = \frac{\alpha_n' - \alpha_n}{\alpha_n} \times 100\% = \frac{\beta_n' - \beta_n}{\beta_n} \times 100\%$. Clearly, the maximum
throughput is achieved when the relative error is zero (i.e., the
secondary user has accurate information on transition probabilities).
Inaccurate transition probabilities can cause performance
loss. We find that the relative performance loss is below 4%
even when the absolute relative error is up to 20%. In the lower
figure, we examine the probability of collision perceived by the
primary network. We find that the probability of collision is
not affected by mismatched transition probabilities. The reason
behind this observation is the separation principle: the optimal
sensor operating point and the optimal access policy, which
determine the probability of collision, are independent of the
spectrum occupancy dynamics.
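The insensitivity of the collision probability to the channel dynamics can be checked with a toy Monte Carlo experiment. The sketch below is a deliberately simplified illustration, not the simulation used for Fig. 5: it follows a single two-state channel, senses it in every slot, and applies the trust-the-sensor access rule, so a collision occurs exactly when a busy slot is sensed idle (a miss detection with probability `delta`). The function name and default parameters are hypothetical.

```python
import random


def collision_rate(alpha: float, beta: float, delta: float,
                   slots: int = 200_000, seed: int = 0) -> float:
    """Empirical Pr{transmit | channel busy} under the trust-the-sensor rule."""
    rng = random.Random(seed)
    state = 1  # start idle; the long-run rate does not depend on this choice
    busy_slots = 0
    collisions = 0
    for _ in range(slots):
        # Markov transition: stay idle w.p. beta, or become idle w.p. alpha.
        p_idle = beta if state == 1 else alpha
        state = 1 if rng.random() < p_idle else 0
        if state == 0:
            busy_slots += 1
            # A miss detection reports the busy channel as idle, and the
            # user, trusting the sensor, transmits and collides.
            if rng.random() < delta:
                collisions += 1
    if busy_slots == 0:
        raise ValueError("no busy slots were observed")
    return collisions / busy_slots


if __name__ == "__main__":
    for alpha, beta in ((0.2, 0.8), (0.2, 0.5), (0.6, 0.4)):
        print(alpha, beta, collision_rate(alpha, beta, delta=0.05))
```

In this toy setting the estimated rate stays close to `delta` for every choice of `alpha` and `beta`, which mirrors the observation that a mismatched model changes which channel is sensed but not the conditional collision probability.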

VI.  Conclusion 

In  this  paper,  we  took  a  cross-layer  approach  to  OSA  design. 
By  jointly  optimizing  the  spectrum  sensor  at  the  physical  layer 
and  the  sensing/access  policies  at  the  MAC  layer,  we  developed 
an optimal OSA strategy that maximizes the throughput of the
secondary  user  under  a  constraint  on  the  collision  probability 
perceived  by  the  primary  network.  By  exploiting  the  rich 
structure  of  the  problem,  we  established  a  separation  principle 
for  the  optimal  joint  design,  which  decouples  the  design  of 
spectrum  sensor  and  access  policy  from  that  of  sensing  policy. 
We  studied  the  impact  of  sensor  operating  characteristics  on 
the  access  policy  and  the  tradeoff  between  false  alarm  and  miss 
detection  in  spectrum  opportunity  identification.  We  observed 
that  miss  detections  are  more  harmful  to  the  performance  of 
OSA  than  false  alarms. 